perm filename TALK2[AP,DBL] blob
sn#104891 filedate 1974-06-05 generic text, type T, neo UTF8
00100
00200
00300
00400 Stanford University, Computer Science Department
00500
00600 Automatic Programming Group Meeting
00700
00800 Thursday, June 6, 1974
00900
01000
01100
01300
01400
01500
01600
01700 PUP5
01800 Continuation of a talk on a program-understanding program
01900 by Doug Lenat
02000
02100
02200
02300 1. Review
02400 2. Implementation details
02500 Precisely what knowledge you need to write induction programs
02600 How to actually construct a working system: Protocol
02700 Deviations from the theory: confessions
02800 Program parameters
02900 3. Questions one should ask about any automatic programming system
03000 Does it write one program?
03100 What is the input dialog like?
03200 Does it write a second program?
03300 How difficult is it to extend it to write more programs?
03400 Where inside it is the program really stored?
03500 4. Results
03600 Input: the user-pup dialogue
03700 Output: the CF program it produced
03800 5. A second task is chosen: grammatical inference
03900 The program it should write
04000 What new beings, functions, and changes to old PUP5 system are
04100 required? How hard will this new task be?
04200 6. Extensions
04300 What programs could be written with only minor modifications?
04400 What programs are very poorly suited to PUP5?
04500 7. Conclusions
04600 Comparison of PUP5 to other systems: size and time of system and of
04700 target programs. Relative performance.
04800 Disussion of effort expended and it worth, for each phase of the
04900 project: conception, protocol, programming, debugging,
05000 future target programs
05100 Was it the right way to solve the problem?
05200 Was it the right problem?
05300 Future directions
05400 immediate: more target programs, by more users
05500 theoretical: analysis of class of programs well/poorly
05600 suited to attack by PUP5 ideas.
05700 sometime: clean up system, more faithful to the ideas.
05800 destiny: what is the impact and the future of the ideas
05900 our group: coordination with other systems
06000 ease
06100 utility
06200 who's in control
06300 do you really want a ten page sort program?
00100 The knowledge necessary to write a concept formation program
00200
00300 A. High-level, domain-specific knowledge
00400 CONCEPT-FORMATION
00500 The user must be aware that we are about to undertake concept formation.
00600 Inference and attention-focussing demons must be turned on.
00700 After completion of this task, PUP will be able to learn concepts.
00800 This is a specialized form of attending, learning, and doing inductive
00900 inference. It is an alternative to grammatical inference, pattern
01000 recognition, and simulated evolution. Its structure must be one of the
01100 following: classificatory concept formation, comparative concept formation,
01200 or metrical concept formation. We must make the boolean decision as to
01300 whether or not concepts may vary with time. Similarly, whether the speed
01400 of presentation of the stimuli is relevant; if so, then we must constrain
01500 the effort spent on various phases of identification. Instances may be
01600 left in view indefinitely, or may be removed right after processing;
01700 this latter casee holds for CF; it means we must derive all relations
01800 (features) as soon as we see a scene. The program will have to be just
01900 complex enough to handle conjunctive, disjunctive, or both kinds of
02000 concpts; this is another decision to make. Similarly for positive,
02100 negative, both, or neither kinds of transfer (psychological), which
02200 affects the recognition that a concept is new, and how previously
02300 learned concepts interact with the learning of new ones. We must whether
02400 to use positive, negative, or boht kinds of instances of a concept.
02500 Subject-specific behavior may be required; that is, the program may
02600 have to input a vector describing a particular individual, and its
02700 whole structure must mimic this subject. The last decision is one of
02800 adapting the program to an extended sample dialogue which the user must
02900 furnish; this will help both to check out the program writtten, and to
03000 fix various print statements and I/O formats. It is easy to call this
03100 being (.1), it has a 50-50 chance of calling* itself, it has only a
03200 .5 chance of succeeding, but the effort to try it is moderate (.5).
03300 There is no fundamental reason for delaying its investigation (.1).
03400 It recognizes itself only through exact matching of one of seven
03500 patterns. It has sentences describing what it does, how, and why.
03600 It is unlikely (-70) to be called if some type of concept formation
03700 is already doable by PUP; if PUP wants to characterize classes, then
03800 it's very likely (88) to be called. The presence of new information
03900 delays (-60) our calling the being, since it might affect what we
04000 should do. Conversely, the absence of new information mildly (40)
04100 encourages us to go on and try it. When finished, the user must be
04200 aware that PUP has decided on a particular type of concept formation,
04300 and that it has done it. The other beings affected depend on the
04400 decisions mentioned earlier.
04500
04600 This is an over-abundance of information. From now on, I will
04700 only give the little pieces of information which are crucial to the
04800 writing of some part of the CF system.
04900
05000 CLASSIFICATORY CONCEPT FORMATION
05100 To do this, we must partition a domain, in an interactive "guessing"
05200 manner.
05300
05400 COMPARITIVE CONCEPT FORMATION
05500 Same as above, but then we must partially order the equivalence classes
05600 of the partition. It is harder, also.
05700
05800 METRICAL CONCEPT FORMATION
05900 Same as previous being, but we must also induce a metric on the
06000 partial ordering of the classes. This is even more complicated.
06100
06200 Since we actually do classificatory CF, the beings to order
06300 a partition and to metrize an ordering weren't implemented.
06400
06500 PARTITION A DOMAIN in a guessing, interactive manner.
06600 The partition may be only partial, it may be only weak, and, most
06700 crucially, the being must be able to do some of these: partition by
06800 accepting an element and getting its class name (guessing the name
06900 and then checking it somehow), partition by accepting a class name
07000 and getting its member elements, partition by accepting ordered
07100 pairs <element, classname>. The fringe of conciousness demon must
07200 be activated from now on.
07300
07400 PARTITION BY TAKE ELEMENT GET CLASS
07500 Take hold of an element (by reading, for example), and then work to
07600 get the name of the class it belongs to (by guessing, then verifying
07700 the guess, for example). Then modify the structure of the class(es)
07800 involved.
07900
08000 PARTITION BY TAKE CLASS GET ELEMENT
08100 Take hoold of a class name, and then work to get elements it contains.
08200 Then modify the structure of the class and the element(s) involved.
08300
08400 PARTITION BY TAKE ELEMENT AND CLASS simultaneously.
08500 Take hold of both and element and its corresponding class name, and
08600 use these to modify the structure of the partition; i.e., the class
08700 mentioned if the partition is stored by classes.
08800
08900
09000 B. Low-level, domain-specific knowledge
09100
09200 RECOGNIZE CONTRADICTION
09300 It can translate (... is incompatible with ...). It is a predicate,
09400 fairly easy to write therefore. It is composed of SOMEOF the following:
09500 Probability≡0 contradiction, Probability≡1 contradiction, Probability
09600 >0 and <1 contradiction. Since these names are fairly cryptic, some of
09700 their parts (e.g, HOW) must be printed out to help the usr choose.
09800
09900 PROBABILITY ≡ 0 CONTRADICITON
10000 Since this is a very simple thing in our domain of concept formation,
10100 it is immediately encodable as (MEMBER arg1 arg2). That is, if arg1
10200 has probability zero of occurring in arg2, yet it does, then we have
10300 a contradiction.
10400
10500 PROBABILITY ≡ 1 CONTRADICTION
10600 Immediately encodable as (NOT (MEMBER arg1 arg2)). If arg1 has probability
10700 one of occurring in arg2, yet it doesn't, then we have a contradiction.
10800
10900 PROBABILITY >0 AND <1 CONTRADICTION
11000 Immediately enacodable as NIL. If arg1 might and might not occur in arg2,
11100 we can't get a contradiction just by checking its membership. Of course,
11200 the idea that this is the only way to prove contradiction is what makes
11300 these beings domain-specific!
11400
11500 SCENE
11600 This is a data structure, composed of four subparts. The first is
11700 a set O of objects. Next is an atom indicating the name N of the scene.
11800 Next come two lists of features, where a feature is just a predicate
11900 and its arguments. The first is the static relations S between objects.
12000 Finally we have the dynamic relations D between objects.
12100
12150
12200 C. Ubiquitous, problem-independent beings and functions
12300
12400 CHOOSE FROM
12500 All its arguments must be beings, else it prints a nasty warning message.
12600 We select the best being and apply it. If it fails, we re-order the
12700 remaining beings and apply the best one of them. Note that this new
12800 reordering may use knowledge gleaned during the earlier, unsuccessful
12900 being try. Thus, this is a (possible) intelligent nondeterministic
13000 branch point. The intelligence lies (or fails to lie) in the comparison
13100 function, BETTER, which decides who goes next.
13200
13300 SATISFY
13400 This is the equivalent of a pattern-directed goal statement. We ask
13500 each being, "Can you do anything matching x?". We take the list of those
13600 answering affirmatively, and CHOOSE FROM it one being after another
13700 until the desired effects are realized. Notice that a being who said
13800 "probably" may succeed in his application and yet not effect the result
13900 we wanted, so that a trivial call on CHOOSE FROM is insufficient.
14000 The beings possible affected are just those answering affirmatively.
14100
14200 MESSAGE
14300 This being has a main effect (AWARE USER x), hence is very frequently
14400 called. The forgetful-user demon trims the aware user list periodically.
14500 Message looks to see if its message is on that list; if not, it inserts it
14600 and prints it out to the user. If it is, it moves the message to the front
14700 of the aware-user list and prints out nothing.
14800
14900 DETERMINE ARG VALUES
15000 These functions take as input the name of a function, and output a
15100 description of what arguments it will ever be called with (in the existing
15200 code.) For example, it might say "arg1 will always be NAME:OF:CLASS, and
15300 arg2 will consecutively be all integers from 3 to (LENGTH SET:OF:CLASSES)".
15400 At present these work in the obvious way, looking at everything. The
15500 tremendous amout of CPU time spent in these functions indicates that I
15600 should have made special assert:lists for argument instantiations, and
15700 updated them each time a being is called in the target code.
15800
15900 FLOW-PRECEDED
16000 This being must search through he code to find a form matching a given
16100 pattern. Although it is used under ten times in the total dialog, it is
16200 so costly that I've implemented it as an ask-the-user call. Work must be
16300 done here to understand why this is so inefficient, and to remedy it.
16400
16500 FIND AND TAG
16600 This being is similar to flow-preceded, except that the pattern-matching
16700 is between two constant strings. This is tolerably efficient in CPU time
16800 and is used heavily throughout the writing of CF.
16900
17000 TRANSLATE
17100 The natural language front end is managed by this being. It asks each being
17200 whether it recognizes a given string. Translate then takes the "best" --
17300 the most probable -- of these as the translation, and can backtrack and
17400 reorder the remaining interpretations if it has to. If called with no
17500 argument, it examines various assert-lists to see if it can do any good.
17600 The idiom demon must be activated during the control period of this being.
17700
17800 REINVESTIGATE DECISION
17900 This is usaully called by a demon who watches the deferred-decision
18000 assert list. We transfer the decision in question from the deferred to
18100 the undeferred decision assert list. A deferral demon will promptly
18200 react to anything on this latter list. An interesting caution: it was
18300 necessary to inhibit all demons during the execution of this being, for
18400 reasons very similar to those leading to lock and unlock system commands.
18500 THe fact that some being might have to be demon-uninterruptable forced us
18600 to institute an entire new question asking just about this tiny point!
18700
18800 DEFER DECISION
18900 Remove the decision from the undeferred decision assert list.
19000 Determine the situation when we must next reinvestigate this decision.
19100 This will be some predicate examing the state of the world. If this
19200 predicate is true currently, we must resolve the decision now. Otherwise,
19300 we put the decision on the deferred decision list, attached to its (new)
19400 reinvestigation predicate. Demons must be inhibited during this being's
19500 reign, to ensure that it's notions of the world are accurate upon exit.
19600
19700 WHEN NEXT
19800 Manipulate the decision to extract the name of the variable holding
19900 information relevant to deferring the decision. Utilize this knowledge
20000 so as to keep the effects of the decision irrelevant; i.e., find the
20100 (next) situation in which they are not irrelevant. Whoever called
20200 this being is now asserted to be aware of its results. THis is fairly
20300 complex, and the being is not called unless it is necessary. As it happens,
20400 it is called a few times for every decision to be made about every being
20500 in the target program (several hundred times).
20600
20700 UTILIZE
20800 This being applies various knowledge "variables," starting at specific
20900 ones and moving toward very general ones, until one of them reports
21000 being able to acheive the desired goal.
21100
21200 RESOLVE DECISION
21300 Again, all demons must be inhibited. After some preliminary searching
21400 and very trivial theroem-proving fail, this being resorts to asking the
21500 user about how to resolve a given decision.
21600
21700 ASK USER ABOUT
21800 We determine the argument instantiations of the little piece of code
21900 we're worrying about, determine the type of decision to be made, and
22000 apply the specific knowledge variable for x-ing that type of decision.
22100 Here, we get x by examing who called this being and why. To write a
22200 specialized version of ask-user-about, we just write a standard
22300 print, read, and assign function, with the details left unspecified
22400 until the sample session is read in.
22500
22600 BETTER
22700 This function is used to compare two beings, and decide which of them
22800 should gain control. It evaluates their WHEN parts, and if they tie
22900 it evaluates their complexity vectors. Note that "eval" here is not
23000 trivial: each dimension of the complexity vector of a being can be a
23100 little program which examines itself, other beings, and the state of the
23200 world before deciding on a numerical answer to rturn.
23300
23400 Handling of User Interrupts
23500 There are several functions and beings involved in this process.
23600 Initially, the user describes how often the system is to give him the
23700 opportunity to interrupt and query it. At each of these times, the
23800 HANDLE USER INTERRUPT function asks the user if he wants to interrupt;
23900 if so, PROCESS USER INTERRUPT is called to do the job. In addition to
24000 asking for pieces of any being, the user may request limited simulated
24100 execution of various pieces, and may order the current being to FAIL.
24200
24300
24400 D. High-level, problem-independent knowledge: how to write programs
24500
24600 SERVE
24700 Obtain information until some of it is "executable," then carry it out.
24800 The forgetful-user demon is activated. Without this top-level purpose,
24900 PUP5 sits contentedly, never wanting to accept a new task.
25000
25100 WRITE PROGRAM
25200 The user must be made aware that PUP is about to write a program, what
25300 kind of program it is, what its name is (this will force a get-name
25400 being call), and that its type has been examined (this will cause a
25500 study-type being call). Upon exit, he must be told that PUP has completed
25600 the task, and what its new capabilities are. To wite a program, one enters
25700 a loop, broken only when several completion conditions are all true
25800 simultaneously: the top-level task is now a being, there are no undefined
25900 sections of code, there are no warnings left about the code, there is
26000 no executable information anywhere, ther is no new but unprocessed
26100 information, there are no decisions still pending (except those requiring
26200 "everything else" to be complete; e.g., the adaptation ofoutput formats
26300 using a sample session). If we do break out of the loop, we must update
26400 the list of programs written, the list of what PUP can now do, of what
26500 the user may do, we find the set of support of the top-level task and
26600 create a new file with the relevant functions and beings (which auto-
26700 matically does global initializations and then enters the top-level
26800 task instead of SERVE). In general, of course, we won't break out,
26900 so we activate all the current demons and go on. All the body of the
27000 loop is is one CHOOSE FROM, between six alternatives: obtaining some
27100 usable info, using some usable info, filling in some function call which
27200 is currently undefined, clarifying some little piece of code known to
27300 be improbable for some specific reason, adapting some known function
27400 to conform to some specific new requirements, fixing some piece of code
27500 which doesn't work the way it claims to work. The last two of these are
27600 simply program modification and debugging, respectively! Failure of one
27700 of these six beings simply causes CHOOSEFROM to try another one; failure of
27800 a demon causes the whole WRITE PROGRAM being to fail. During its reign,
27900 the program-writing demons, deferral-demon, ad reinvestigation-demon are
28000 all activated. Its complexity vector is dependent upon that of the
28100 being closest to the task it must perform.
28200
28300 OBTAIN USABLE INFORMATION
28400 The WHEN part informs us that this is always undesirable (-10) but
28500 is OK (111) if there exists new but not yet usable information. All
28600 we do here is CHOOSE FROM the following four alternatives: translate
28700 something, get brand new information, analyze the implications of
28800 existing information, extract a small relevant subset of the existing
28900 information to concentrate on.
29000
29100 USE INFORMATION
29200 This demands that some executable information exist. We select one such
29300 piece, and try to execute it. If we fail, its worth is decreased; if we
29400 succeed, it is removed from the executable info assert list.
29500
29600 FILL IN UNDEFINED SECTION
29700 THere must be some undefined section. If so (80) we don't want any
29800 hi-priority (≤20) coding warnings still around (-150), and we do like
29900 there to be something both undefined and known to be encodable (96).
30000 We fix a choice of what to encode, and try to acheive its encoding.
30100 If we fail, we update the difficulty of that choice, and may assert that
30200 we want some specific new information to relieve the problem. In addition
30300 to ENCODE, this being also may call MAKE ENCODABLE and STUDY TYPE.
30400
30500 CLARIFY IMPROBABLE SITUATION
30600 This being demands that something of mediocre priority (≤500) exist on
30700 the coding warning assert-list. It likes (51) this, and dislikes (-41)
30800 anything on the undefined section list, or anything (-200) on the encodable
30900 section list. As always, a sentence is provided to justify each of these
31000 little beliefs. We choose the warning with the highest priority (lowest
31100 numerical weight) on the coding warning list, note that that is what PUP
31200 is working on now, and do a match to decide what type of warning it is.
31300 (i) Replace x by y in z
31400 Here these may be nonspecific; z may be "in all code recently generated".
31500 The nature of y may cause us to include new warnings; y may mention a new
31600 data structure.
31700 (ii) x in y is undefined; probably z since r
31800 This may cause us to add to the global initialization list. It will
31900 probably cause us to ask the user what the answer is.
32000 (iii) x is a data structure but we don't know much about it
32100 We try to find out its structure, how to initialize, access, insert,
32200 delete, update it. A variant of this warning is:
32300 (iv) We find no x's associated with data structure y
32400 Here x can point to insertions, deletions, initializations, etc.
32500 If we can't justify the lack, we try to defer the decision. Failing
32600 that, we ask the user.
32700 (v) Command if x then y
32800 This is a programmed demon; when situation x is true, we must do y.
32900 (vi) Delete all mention of x
33000 This is like a replace, but we go through the assert lists with an eye
33100 toward deleting unnecessary worries.
33200 (vii) Infinite loop in x from y to z
33300 If we can't justify this, we insert a test to break out of the loop.
33400 Justification might be that this loop is in the top-level function of
33500 the system, where we never wnat to break out.
33600 (viii) Incomprehensible (there is a "bug" in x manifesting itself as y)
33700 Never needed to write CF, so not implemented. SHould call FIX INCORRECT
33800 PIECE (which is also not in yet) or ask the user for assistance.
33900
34000 GET NEW INFORMATION
34100 Naturally, it is not thrilled if (-68) there exists some new but
34200 unexamined information, and it is happy (50) if there is none.
34300 The prerequisites ensure that the user is aware of what PUP wants,
34400 and if the theorem prover can't deliver it, PUP asks the user for some.
34500 If PUP asks for something general ("any task") it is because it knows
34600 precisely that this is what it wants, not out of ignorance! During
34700 execution, the specificity check demon is active; he ensures that
34800 it is indeed phrased as specifically as possible; if not, MAKE SPECIFIC
34900 will be called. This is a very uncomplicated being, and a very unpopular
35000 one to use since we should squeeze every drop of meaning out of what
35100 we have before asking for more information.
35200
35300 EXTRACT RELEVANT SUBSET
35400 This likes (70) there to be a great deal (≥50 pieces) of new information,
35500 and dislikes (-80) it if there are under a dozen such tokens.
35600 It finds and evaluates knowledge variables to constrain what should be
35700 looked at currently. This is never called in the dialog, though it was in
35800 the protocol.
35900
36000 ANALYZE IMPLICATIONS
36100 The WHEN part is unhappy (-60) if there is usable information already,
36200 since this being is fairly costly. It also examines the size of the
36300 new info list to see just how long this search will be. The being locates
36400 the code that will be affected, predicts the affects, and sees how
36500 desirable this is. This being is also never needed to write CF.
36600
36700 MAKE ENCODABLE
36800 If all else fails, this being tries replacing a function by one of its
36900 alternatives or, as a last resort, by one of its generalizations. This
36950 last resort is never needed to write CF.
37000
37100 ENCODE
37200 Despite its key name, this being is mostly a bookkeeper! It runs around
37300 the assert lists, gathering up enough information to encode a specified
37400 little piece of code. A program schema is built up, instantiated as much
37500 as possible, printed out for the user to refer to, and passed to a highly
37600 optimized recursive auxilliary function, GETCODE. Some woorying about the
37700 arguments is done,including what they might be instantiated as. We inform
37800 the user of the code being written, and a prerequisite causes the function
37900 to become made into a being.
38000
38100 STUDY TYPE
38200 This being accepts a being call, looks at that being's specializtions
38300 part, translates each entry into a decision, and dumps these onto the
38400 undeferred decision list. A deferral demon promtly attends to them.
38500
38600 GET DATA STRUCTURE
38700 We call select-structure type, and use its advice to point to the
38800 proper knowledge variable. We eval it to set up the various parts of
38900 the data structure. In unclear cases, we may ask the user whether the
39000 argument really is a data structure. We ensure that this object is
39100 a being (else we make it one,) and we add warnings to the effect that
39200 there might not be any insertions or deletions; we'll worry about that
39300 much later, by another being. We know that the elements of a data
39400 structure are themselves usually data structures, so one the prerequisites
39500 says that we must try to make the arguments into data structures as well.
39600
39700 GETCODE
39800 This is a long, special-case, recursive function. It goes through a piece
39900 of code with two jobs: to build up a list of arguments to this function,
40000 and to get names for specific instances of general beings.
40100 Part of the kludgy character is due to the fact that there are non-beings
40200 in the code: primitive forms (structure, tuple, vector, class, comment,
40300 atoms), primitive functions (read, null, and ,...), primitive variables
40400 (t, nil, 4, ...). By keeping closely to the theoretical ideals implicit
40500 in the ideas, this function would probably become trivial.
40600
40700 GET NAME
40800 First we study plausible names for the new specialized being. We make the
40900 user aware of them. We examine the being name carefully, to see if it
41000 was just mentioned in the same piece of code (probably then the same name),
41100 whether it's ever been mentioned and specialized, and so on. Sometimes we
41200 end up deciding not to get a new name, sometimes we pick one and just tell
41300 the user, sometimes we recommend some old specialized name. In 90% of the
41400 cases, though, we simply say "give me a name for ...". A new-name counter
41500 is maintained to ensure unique names, and this number is appended onto the
41600 end of what the user types, uunless it's an already-existing being name.
41700 The user my type ] (and usually does), indicating that PUP is to choose.
41800 PUP always informs the user what the name is at the end.
41900
42000 MAKE NEW BEING
42100 This being has the awesome responsibility of giving life to new beings.
42200 As is the case with neurons, soldiers, and all (good) beings, no one being
42300 really does much when you look at it in isolation. Thus this one already
42400 gets the name of the being, the name of its parent (its least general
42500 generalization), how the metacodes of these tow differ, the arglist,
42600 the reason for this specialized being to exist, what extra qualities it
42700 possesses or lacks wrt its parent, etc. It can figure out who it affects
42800 by examing its various parts, and it bases the complexity vector on
42900 that of the parent and on the changes in this new being. Thus it basically
43000 does gets, evals, and puts. It updates various assert lists, and semi-
43100 compiles the new being. One bit of knowledge says that the explicit args
43200 check may be significantly different, and the user may be queried.
43300
43400
43500 E. Low-level problem-independent knowledge: bits of programs
43600
43700 PROPOSE PLAUSIBLE NAMES
43800 This being is called by GetName, and should incorporate a good model of user
43900 psychology of name-choosing. Currently, it applies Initials, Mainwords,
44000 Firstfew, and compositions of these, to the task description sentence.
44100
44200 SEMI COMPILE
44300 Basically, beings only lend themselves to interpreting; to help speed up
44400 this process, this being can take advantage of regularities and simplicities
44500 in anohter being's parts, and turn out a fast function to do an equivalent
44600 job. For example, if a being doesn't enable any new demons, we needn't
44700 push the demon stack at the beginning and pop it at the end.
44800
44900 SELECT STRUCTURE TYPE
45000 This is a true bit of programming knowledge: if the structure is composed
45100 of several weakly-interacting pieces tied together through association with
45200 one atom, then the data structure should be a list of these atoms, with
45300 the associated structures being stored on the property lists of the atoms.
45400 If there are only a couple pieces, or they interact strongly, we should
45500 use a nested list structure instead.
45600
45700 ELEMENT
45800 All we know about an element is that it is a member of a data structure,
45900 and that we should not be ashamed to ask the user to clarify himself if
46000 he mentions this. The ConceptFormation being should note that future
46100 refernces to element actually refer to a scene, an instance of a concept,
46200 and that references to class refer to the model of a concept, a set of
46300 scenes. This may change as new data structures come into existence.
46400
46500 MODIFY STRUCTURE
46600 Generally, we will be given a typical element, and must identify the
46700 structures it blongs to, and modify them. The precise details indicate that
46800 some subset of CONDITIONAL INSERTION, CONDITIONAL DELETION, and COMPLEX
46900 ALTERATION will be involved.
47000
47100 GET HOLD OF by guessing and checking
47200 We must discover whether an algorithm already exists which can get arg1.
47300 If not, we try to find one which can get arg1 and some other effects. We
47400 structure the function as some of COMPUTE, GENERATE&TEST, and SEARCH.
47500 Finally, we must decide now on the type of error recovery desired: none,
47600 backtrack, or correct-and-try-to-proceed.
47700
47800 TAKE HOLD OF trivially
47900 We examine the flow of control through the preceding code, to decide
48000 whether arg1 has ever been SET. If so, we must ask the user whether or not
48100 to read in a new value. If no read is to be done, then this being reduces
48200 to a simple assignment, or perhaps to nothing at all. Otherwise, we read
48300 in the object, and assign its various subparts to SOME PART OF it.
48400
48500 IS OF TYPE
48600 This being is a predicate which is too low-level and general to do much.
48700 Basically, it helps formulate a question to the user, who must explain
48800 to PUP how to construct any predicate it comes across, usually just by
48900 translating a sentence the user types in.
49000
49100 COMPLEX ALTERATION
49200 This being is always replaced by some initializing assignments, followed
49300 by calls on MODIFY STRUCTURE for some subparts of arg1. A bit of advice
49400 is that if arg1 is unstructured, try to get it structured. If this fails,
49500 maybe what is really wanted is to modify the structure of which arg1 is
49600 a member.
49700
49800 CONDITIONAL DELETION
49900 As above, we look at the structuring of various arguments to decide what
50000 is really supposed to be deleted from where. We check with the user,
50100 remind him of various bindings relevant to the current call, and ak him
50200 to describe (1) under what conditions we do the deletion, (2) what exactly
50300 do we delete. These are translated, analyzed in the context of deletion,
50400 and help determine the deletion piece of code.
50500
50600 CONDITIONAL INSERTION
50700 This is similar to the preceding being. Here we also worry about
50800 whether the entity to be inserted has ever been bound. If not, we must see
50900 that it is! Often, this binding will be related to the Initialize part of
51000 the DataStructure piece of the being representing the structure we're
51100 inserting into. Since some data structures have several similar but
51200 distinctly-named elements existing simultaneously, we have lots of little
51300 worries. After we have planned out the code, we check with the coding
51400 warning list and add to it; e.g., any undefined initial value of a variable
51500 in a piece of code we stuck in here, will also be uninitialized here. If
51600 we later decide never to worry about the first initialization, we must
51700 not forget this one! This is a frequent source of bugs, I think.
51800
51900 GENERATE&TEST and COMPUTE are not needed or implemented.
52000
52100 SEARCH
52200 Currently, a primitive type of searching suffices. We simply execute
52300 the iteration FOREACH X IN XSET DO (TEST X). If we're unsure, we check
52400 that XSET has the form SET OF (PLURAL X). The specializations tell us to
52500 worry about termination conditions. I suppose this being could also be
52600 used for generate-and-test.
52700
52800 FOREACH
52900 Not surprisingly, this is the iteration being. It is an NLAMBDA function,
53000 so its arguments aren't surrounded by quotes. There are various minor
53100 decisions to make, which simplify any specialized version:
53200 there may or may not be an "until" condition; we must get it and also
53300 decide what to bind the iteration variable to and what value to return,
53400 both in case this condition is met, and in case we exhaust the space.
53500 Often, we will decide to leave some of these unspecified, put notes about
53600 them on the coding warning list, and not worry about them for a long time.
53700
53800 TEST
53900 This being is a predicate which is oriented toward a threshold of accepta-
54000 bility, whereas ISOFTYPE is oriented toward separating cases. It either
54100 has the flavor of comparing, or of competition. We must also decide
54200 whether the result is nominal, ordinal, or of ratio caliber. These latter
54300 two never crop up, which is why we assume the test is a predicate always.
54400
54500 SOME PART OF
54600 We either get this simple structural function by examples (like SHaw's
54700 program) or by translating a user-supplied English sentence describing
54800 what it does. Typically, some combinations of CAR, CDR, CONS, and NIL.
54900
55000 COMPARE
55100 We must worry about whether the result is nominal, ordinal, or ratio.
55200 We also decide whether the comparison is a JOINING FUNCTION applied to
55300 its arguments, a constant function (executed only for side effects),
55400 or a JOINING FUNCTION applied to the results of COMPAREing corresponding
55500 subparts of the arguments. This last case is both common and complicated.
55600 If neither argument is structured, this is impossible. If both are highly
55700 structured, their structures must have a nontrivial amount of correspondence
55800 in order to succeed. If only one argument is structured, this strongly
55900 suggest that the other one should be similarly structured. Often we go
56000 ahead and structure it without asking the user.
56100
56200 BETTER
56300 We've discussed this earlier. Here we should note that when called on
56400 to write a new, specialized BETTER function, it chooses either a simple
56500 function (T, NIL) to allow an optimization (replace by CONS, replace by
56600 NCONC1), or it chooses a complex comparing function (e.g., alphorder,
56700 write-program itself!).
56800
56900 JOINING FUNCTION
57000 This is a way of condensing results. It is typically a known function,
57100 asuch as AND, OR, PLUS, MAXIMUM, PROGN... which is determined by (i) the
57200 character of the result (numerical, T/F) and (ii) whether we are in
57300 a DO UNTIL loop, a DO REPEATEDLY loop, or neither of these loops.
57400
57500
57600 F. Programming Knowledge stored in variables
57700
57800 To resolve an ADAPTATION decision
57900 Ask for sample dialog, symbolically run current code, modifying I/O formats.
58000
58100 To defer any decision whose AFFECTS part is known
58200 It may translate as some detail of x; in that case, wait until some code for
58300 x already exists. The affects part may translate as The x algorithm; in this
58400 case we worry as soon as PUP begins DO-ing any coding at all of x.
58500
58600 To defer an ALTERNATIVES decision
58700 Examine the coding tasks in all cases, and try to find some common head
58800 and/or some common tail. If there is any, try to defer the decision until
58900 after writing code for this common subtask sequence. If one alternative
59000 exactly matches the common sequence, we can temporarily assert that the
59100 decision has been made to do this simplest being.
59200
59300 To terminate an AND loop
59400 The conjunction is usually between highly similar objects. Related to how
59500 to parse a sentence containing ANDs.
59600
59700 To resolve a BOOLEAN decision
59800 Ask the user to respond Yes or No. The decision has special Yes, No, and
59900 Both parts; in each case, two of them will be evalled.
60000
60100 To resolve a DEFINITION decision
60200 Locate the defined object, reaffirm that it is undefined, ask the user to
60300 define it, check whether it is a predicate, data structure, etc., and tell
60400 the user about such constraints.
60500
60600 To resolve a DICHOTOMY decision
60700 Each such decision contains a little program which now is evaluated. If
60800 it succeeds, its value answers the decision; if it fails, we have to ask
60900 the user to choose alternative 1 or 2. The choice points to a little
61000 program to run, and its value is the desired resolution.
61100
61200 To resolve a PREDICATE decision
61300 Fix the context of the predicate call, try to relate this predicate to some
61400 known tests, and, if this fails, ask the user. His response is specially
61500 processed in the context of a predicate fixation.
61600
61700 To set up a data structure stored as a LIST STRUCTURE
61800 The initialization is a simple SETQ to an as-yet undefined value.
61900 Access is simply by mentioning the name. Insertion is by Merge:in or
62000 Merge. Deletion is by Pullout or Setdifference.
62100
62200 To set up a data structure stored as a PROPERTY LIST STRUCTURE.
62300 Delineate the pieces to be associated with each atom, and name them.
62400 Accession is via GETP. Insertion s via a GETP, then a MERGE:IN, then a
62500 PUT. Deletion is via a GETP, then a PULLOUT, then a PUT. Initialization
62600 is a PUT of a SETQ to an as-yet undefined value. Each named substructure
62700 must itself become a data structure, typically a simple list structure.
62800
62900 To resolve a SOMEOF decision
63000 The user must be sufficiently informed about the choices. He types back
63100 a non-null, ordered subset of those choices. We examine them to see if
63200 they should be enclosed in a PROGN or in a REPEATEDLY, and whether we
63300 would like to see something structured in a certain way.
63400
63500 To resolve a SUBSETOF decision
63600 Similar to above, but no fancy investigation, just string the choices
63700 together in a PROGN.
63800
63900 To defer any decision whose WHEN part is provided
64000 We transform "before deciding firmly how to ←x ←←y" into something like
64100 "member (cons detail-of-$x-ing $y) doing-pup-list". We also translate
64200 Before any ←←x routine is finalized, After ←←x routines are finalized,
64300 and similar phrases. These must always evaluate T or NIL.
64400
64500
64600 G. Demons and functions of interest
64700
64800 LIST:JOIN, MERGE:IN, PULLOUT
64900 These rather standard functions are gicen a tiny bit of advice: if their
65000 "element" is more like a sublist than an element of their "list", then
65100 they assume that what was meant was append or setidifference, not cons
65200 or merge or remove.
65300
65400 LONG NAME DEMON
65500 If any name becomes unwieldy, he notices it and asks the user for a
65600 nickname.
65700
65800 PERMIT DETAILED DECISION
65900 Implicit near the beginning of each decision, PUP called this function.
66000 It asks the user if he wnats more details, and if so it gets them and
66100 prints them out.
66200
66300 STRUCTURE INDUCING DEMON
66400 If the object is a being already, special considerations apply.
66500 If the object and all arg values are ill-defined, we decide not to do
66600 any structuring. Otherwise, we investigate the effects of structuring
66700 the object into the pieces specified in the args. If there is no problem,
66800 and if the user consents, we tack the appropriate Replace messages onto the
66900 coding warning list (with a high priority). We activate Long Name Demon.
67000
67100 Natural Language Translation
67200 We have already discussed the TRANSLATE being and the basic way of handling
67300 natural language input. Several beings exist primarily for this purpose;
67400 RECOGNIZE:CONDITIONAL, RECOGNIZE:CONJUNCTION, RECOGNIZE:EQUALITY,
67500 RECOGNIZE:FUNCTION:RETURNS, RECOGNIZE:INCLUSION, RECOGNIZE:LITERALS,
67600 RECOGNIZE:SET:RELATIONS, RECOGNIZE:SOME:MEMBER, ADD:DEFINITION,
67700 ADJECTIVE:HANDLER, REPEATEDLY. Also, there are several functions
67800 related to translation: e.g., UNGERUNDIFY, PLURAL, OPPOSITE.
67900 All these are straight-forward, and their task is obvious from their name.